Sample-weighted clustering methods
Abstract
Although there has been much research on cluster analysis that considers feature (or variable) weights, little effort has been made regarding sample weights in clustering. In practice, not every sample in a data set has the same importance in cluster analysis, so it is of interest to obtain proper sample weights for clustering a data set. In this paper, we consider a probability distribution over a data set to represent its sample weights. We then apply the maximum entropy principle to compute these sample weights automatically for clustering. This approach can generate sample-weighted versions of most clustering algorithms, such as k-means, fuzzy c-means (FCM) and expectation-maximization (EM). The proposed sample-weighted clustering algorithms are robust for data sets with noise and outliers. We also analyze the convergence properties of the proposed algorithms. Numerical and real data sets are used for demonstration and comparison. The experimental results and comparisons demonstrate that the proposed sample-weighted clustering algorithms are effective and robust clustering methods.
Keywords: Cluster analysis; Maximum entropy principle; k-means; Fuzzy c-means; Sample weights; Robustness
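The abstract does not state the objective function, so the following is only a minimal sketch under an assumed formulation, not the paper's published algorithm. It assumes the weights w_i form a probability distribution that minimizes sum_i w_i d_i + gamma * sum_i w_i ln w_i, where d_i is the squared distance from sample i to its nearest center; under that assumption the weights have the closed form w_i proportional to exp(-d_i / gamma). The function name `sample_weighted_kmeans` and the parameters `gamma` and `init` are hypothetical names chosen for illustration.

```python
import numpy as np

def sample_weighted_kmeans(X, k, gamma=1.0, n_iter=50, init=None, seed=0):
    """Illustrative k-means with entropy-regularized sample weights.

    Assumed objective (not taken from the paper):
        sum_i w_i * d_i + gamma * sum_i w_i * ln(w_i),  with sum_i w_i = 1,
    where d_i is the squared distance of sample i to its nearest center.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    if init is None:
        # Pick k distinct samples as starting centers.
        rng = np.random.default_rng(seed)
        centers = X[rng.choice(n, size=k, replace=False)].copy()
    else:
        centers = np.asarray(init, dtype=float).copy()

    labels = np.zeros(n, dtype=int)
    weights = np.full(n, 1.0 / n)
    for _ in range(n_iter):
        # Squared Euclidean distance of every sample to every center (n x k).
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        d_min = d2[np.arange(n), labels]

        # Maximum-entropy weights: w_i proportional to exp(-d_i / gamma).
        # Subtracting the smallest distance only improves numerical stability.
        weights = np.exp(-(d_min - d_min.min()) / gamma)
        weights /= weights.sum()

        # Each center becomes the weighted mean of its assigned samples.
        for j in range(k):
            mask = labels == j
            if mask.any() and weights[mask].sum() > 0:
                centers[j] = np.average(X[mask], axis=0, weights=weights[mask])
    return centers, labels, weights
```

In this sketch, samples far from every center receive exponentially small weights, so an outlier contributes little to the center update, whereas an unweighted mean would be pulled toward it. A larger `gamma` makes the weights more uniform and the behavior closer to ordinary k-means.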
Similar resources
Bilateral Weighted Fuzzy C-Means Clustering
Nowadays, the Fuzzy C-Means method has become one of the most popular clustering methods based on minimization of a criterion function. However, the performance of this clustering algorithm may be significantly degraded in the presence of noise. This paper presents a robust clustering algorithm called Bilateral Weighted Fuzzy C-Means (BWFCM). We used a new objective function that uses some k...
Sample-Weighted Fuzzy Clustering with Regularizations
Although there has been much research in cluster analysis on feature weights, little effort has been made on sample weights. Recently, Yu et al. (2011) considered a probability distribution over a data set to represent its sample weights and then proposed sample-weighted clustering algorithms. In this paper, we give a sample-weighted version of generalized fuzzy clustering regularizati...
A Kernel Fuzzy Clustering Algorithm with Generalized Entropy Based on Weighted Sample
Aiming at fuzzy clustering with generalized entropy, a kernel fuzzy clustering algorithm with generalized entropy based on weighted samples is presented. By introducing sample weights into the objective function for fuzzy clustering with generalized entropy, we obtain the optimization problem for fuzzy clustering with generalized entropy based on weighted samples. We then use the Lagrange multiplier method ...
Weighted Ensemble Clustering for Increasing the Accuracy of the Final Clustering
Clustering algorithms are highly dependent on factors such as the number of clusters, the specific clustering algorithm, and the distance measure used. Inspired by ensemble classification, one approach to reducing the effect of these factors on the final clustering is ensemble clustering. Since weighting the base classifiers has been a successful idea in ensemble classification, in th...
A Weighted Sample’s Fuzzy Clustering Algorithm With Generalized Entropy
Combining sample weights with a kernel function, a fuzzy clustering method with generalized entropy is studied. An objective function for fuzzy clustering with generalized entropy based on sample weighting is obtained. Following that, a fuzzy clustering algorithm with generalized entropy based on sample weighting is presented. In addition, by introducing a kernel into the presented objective functio...
Journal:
- Computers & Mathematics with Applications
Volume 62
Publication date: 2011